Data Quality Enhancement with String Length Distribution
Abstract
In recent years, the volume of collectable manufacturing data has grown rapidly. At the same time, mega recalls have become a serious social problem. Under these circumstances, there is a growing need to prevent mega recalls through defect analysis, such as root cause analysis and anomaly detection, using manufacturing data. However, traditional methods take too long to classify strings in manufacturing data to meet the requirements of quick defect analysis. We therefore present the String Length Distribution Classification (SLDC) method, which classifies strings correctly in a short time. The method learns character features, especially the string length distribution, from Product IDs and Machine IDs in the BOM and asset list. By applying the proposed method to strings in actual manufacturing data, we verified that string classification time can be reduced by 80%. This result suggests that the requirement of quick defect analysis can be fulfilled.
Keywords: data quality, feature selection, probability distribution, string classification, string length.
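The abstract does not spell out how SLDC works, so the following is only a minimal sketch of the general idea of classifying a string by its length distribution. It assumes labeled example strings per class (for instance Product IDs and Machine IDs), estimates a smoothed empirical length distribution for each class, and assigns a new string to the class under which its length is most likely. The function names, the `max_len` cap, the smoothing constant `alpha`, and the sample IDs are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def learn_length_distributions(examples, max_len=64, alpha=1.0):
    """Estimate P(length | label) for each label.

    examples: dict mapping a label to a list of example strings.
    Lengths above max_len are clipped into the last bucket.
    alpha is an additive (Laplace) smoothing constant, so unseen
    lengths keep a small non-zero probability.
    """
    dists = {}
    for label, strings in examples.items():
        counts = Counter(min(len(s), max_len) for s in strings)
        total = len(strings) + alpha * (max_len + 1)
        dists[label] = [
            (counts.get(n, 0) + alpha) / total for n in range(max_len + 1)
        ]
    return dists


def classify(s, dists):
    """Return the label whose length distribution gives len(s) the highest probability."""
    def prob(label):
        dist = dists[label]
        n = min(len(s), len(dist) - 1)  # clip to the last bucket
        return dist[n]

    return max(dists, key=prob)


# Hypothetical training strings, for illustration only.
training = {
    "product_id": ["PRD-000123", "PRD-000456", "PRD-000789"],
    "machine_id": ["M01", "M02", "M17"],
}
dists = learn_length_distributions(training)
print(classify("PRD-999999", dists))
print(classify("M99", dists))
```

Because only the length of each string is inspected, classification takes a single table lookup per class. The abstract mentions other character features as well, which this sketch leaves out.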
Similar resources
Effect of Solid-State on Load Distribution Transformer Tap-Changer on Power Quality Enhancement
Recently, the electronic tap-changer has received more attention due to its quicker response, better performance, and simpler maintenance compared with the mechanical tap-changer. This paper presents the capability of a distribution transformer equipped with an electronic tap-changer to improve power quality. To this end, the analytical computation for determining the compensating limit of electroni...
Automatic Determination of Fiber-Length Distribution in Composite Material Using 3D CT Data
Determining fiber length distribution in fiber reinforced polymer components is a crucial step in quality assurance, since fiber length has a strong influence on overall strength, stiffness, and stability of the material. The approximate fiber length distribution is usually determined early in the development process, as conventional methods require a destruction of the sample component. In thi...
Effects of Disc Insulator Type and Corona Ring on Electric Field and Voltage Distribution over 230-kV Insulator String by Numerical Method
Insulator strings with various materials and profiles are very common in overhead transmission lines. However, the electric field and voltage distribution of an insulator string is uneven, which may easily lead to corona, deterioration of the insulators' surfaces, and even flashover. So the calculation of the electric field and voltage distribution along them is a very important factor in the operation time....
The (non-)existence of perfect codes in Lucas cubes
A Fibonacci string of length $n$ is a binary string $b = b_1b_2\ldots b_n$ in which for every $1 \leq i < n$, $b_i\cdot b_{i+1} = 0$. In other words, a Fibonacci string is a binary string without 11 as a substring. Similarly, a Lucas string is a Fibonacci string $b_1b_2\ldots b_n$ that $b_1\cdot b_n = 0$. For a natural number $n\geq1$, a Fibonacci cube of dimension $n$ is denoted by $\Gamma_n$ and i...
VGRAM: Improving Performance of Approximate Queries on String Collections Using Variable-Length Grams
Many applications need to solve the following problem of approximate string matching: from a collection of strings, how to find those similar to a given string, or the strings in another (possibly the same) collection of strings? Many algorithms are developed using fixed-length grams, which are substrings of a string used as signatures to identify similar strings. In this paper we develop a nov...
Publication date: 2017